Lj Miranda

Hi! I'm Lj Miranda, and welcome to my website!

I'm currently a predoctoral researcher on the AllenNLP team at Ai2. In the past, I've worked as an engineer, consultant, and researcher, mostly in the fields of NLP and AI.

I'm broadly interested in data-centric approaches to building language technologies at scale. I'm happy to discuss research and collaborate, so feel free to reach out!

What's New?

May 2025: Excited to share that I have three first & co-first author papers accepted at ACL Main: HyPER, M-RewardBench, and UD-NewsCrawl. A large collab project, SEA-VL, also got into Main!

Nov 2024: Happy to have been part of the exciting Tülu 3 and OLMo 2 releases! My primary contribution was scaling up our preference data using a synthetic on-policy pipeline, which led to improvements in our DPO models.

Oct 2024: Our paper on routing preference instances to human or LM annotators, Hybrid Preferences, is now available. This is the first work I co-led (with Yizhong Wang) at Ai2!

Oct 2024: Our paper on evaluating reward models in multilingual settings, M-RewardBench, is now available. This was a fun collab with folks from Cohere for AI!

Sep 2024: My cross-institutional collabs, Consent in Crisis and SEACrowd, were accepted to NeurIPS D&B and EMNLP 2024, respectively.

Aug 2024: 🏆 Our work on evaluating reward models in multilingual settings won Silver Prize in Cohere for AI’s Aya Expedition!

Jul 2024: I gave a guest lecture at DLSU about building Filipino NLP resources. Thanks to Dr. Charibeth Cheng for inviting me!

Recent Posts

Desiderata for Filipino NLP in the Age of LLMs

Guest lecture @ DLSU Manila: Artisanal Filipino NLP Resources in the time of Large Language Models

A lexical view of contrast pairs in preference datasets
